77 research outputs found

    Contribution to Panoptic Segmentation

    Get PDF
    Full visual scene understanding has always been one of the main goals of machine perception. The ability to describe the components of a scene using only information captured by a digital camera has been the main focus of computer vision tasks such as semantic segmentation and instance segmentation, where, using deep learning techniques, a neural network is able to assign a label to each pixel of an image (semantic segmentation) or to delineate the boundaries of an object instance more precisely than a bounding box (instance segmentation). The task of panoptic segmentation, proposed by Kirillov et al., aims to achieve a full scene description by merging semantic and instance segmentation information and leveraging the strengths of these two tasks. This report presents a possible alternative for solving this merging problem, using Convolutional Neural Networks (CNNs) to refine the boundaries between classes.
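
    As a rough illustration of the merging step described above, the sketch below pastes instance masks onto a semantic prediction in order of confidence, so "thing" pixels receive an instance id while the remaining pixels keep their "stuff" label. The array shapes, overlap threshold and function name are illustrative assumptions, not the report's actual pipeline.

```python
import numpy as np

def merge_panoptic(semantic, instances, overlap_thresh=0.5):
    """Naive semantic + instance fusion into a panoptic map (illustrative only).

    semantic  : (H, W) int array of per-pixel class ids ("stuff" and "thing").
    instances : list of (mask (H, W) bool, class_id, score) tuples.
    Returns a (H, W, 2) array holding (class_id, instance_id) per pixel.
    """
    h, w = semantic.shape
    panoptic = np.stack([semantic, np.zeros_like(semantic)], axis=-1)
    occupied = np.zeros((h, w), dtype=bool)

    # Paste instances from highest to lowest confidence.
    for inst_id, (mask, class_id, _score) in enumerate(
            sorted(instances, key=lambda x: -x[2]), start=1):
        free = mask & ~occupied
        # Skip instances that are mostly hidden by higher-scoring ones.
        if free.sum() < overlap_thresh * max(mask.sum(), 1):
            continue
        panoptic[free, 0] = class_id
        panoptic[free, 1] = inst_id
        occupied |= free
    return panoptic


if __name__ == "__main__":
    sem = np.zeros((4, 4), dtype=int)      # class 0 = road ("stuff")
    car = np.zeros((4, 4), dtype=bool)
    car[1:3, 1:3] = True                   # one "thing" instance
    pan = merge_panoptic(sem, [(car, 1, 0.9)])
    print(pan[..., 0])                     # class map
    print(pan[..., 1])                     # instance ids
```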

    Panoptic Segmentation Based on YOLO

    Get PDF
    Given the recent challenge of panoptic segmentation, where every pixel in an image must be given a label, as in semantic segmentation, and an instance id, a new YOLO-based architecture is proposed here for this computer vision task. This network uses the YOLOv3 architecture, plus parallel semantic and instance segmentation heads, to perform full scene parsing. A set of solutions for each of these two segmentation tasks is proposed and evaluated, where a Pyramid Pooling Module is found to be the best semantic feature extractor given a set of feature maps from the Darknet-53 backbone network. The network gives good segmentation results for both stuff and thing classes when trained with a frozen backbone: boundaries between background classes are consistent with the ground truth, and the instance masks closely match the true shapes of the objects present in a scene.
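
    The Pyramid Pooling Module mentioned above can be sketched roughly as follows: context is pooled at several grid sizes, projected, upsampled and concatenated with the input feature map before a final convolution. The channel sizes and pooling bins below are assumptions, not the configuration used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPoolingModule(nn.Module):
    """PSPNet-style context module: pool at several scales, project,
    upsample back and concatenate with the input features."""

    def __init__(self, in_channels=1024, out_channels=256, bins=(1, 2, 3, 6)):
        super().__init__()
        self.stages = nn.ModuleList([
            nn.Sequential(
                nn.AdaptiveAvgPool2d(b),
                nn.Conv2d(in_channels, in_channels // len(bins), 1, bias=False),
                nn.ReLU(inplace=True),
            ) for b in bins
        ])
        self.project = nn.Conv2d(in_channels * 2, out_channels, 3, padding=1)

    def forward(self, x):
        h, w = x.shape[-2:]
        pooled = [F.interpolate(stage(x), size=(h, w), mode="bilinear",
                                align_corners=False) for stage in self.stages]
        return self.project(torch.cat([x, *pooled], dim=1))


if __name__ == "__main__":
    feats = torch.randn(1, 1024, 16, 16)         # e.g. a backbone feature map
    print(PyramidPoolingModule()(feats).shape)   # -> torch.Size([1, 256, 16, 16])
```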

    Vehicle Motion Forecasting using Prior Information and Semantic-assisted Occupancy Grid Maps

    Full text link
    Motion prediction is a challenging task for autonomous vehicles due to uncertainty in the sensor data, the non-deterministic nature of the future, and the complex behavior of agents. In this paper, we tackle this problem by representing the scene as dynamic occupancy grid maps (DOGMs), associating semantic labels with the occupied cells and incorporating map information. We propose a novel framework that combines deep-learning-based spatio-temporal and probabilistic approaches to predict vehicle behaviors. Contrary to conventional OGM prediction methods, evaluation of our work is conducted against the ground truth annotations. We experiment and validate our results on the real-world nuScenes dataset and show that our model has a superior ability to predict both static and dynamic vehicles compared to OGM predictions. Furthermore, we perform an ablation study and assess the role of semantic labels and map information in the architecture. Comment: Accepted to the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023).
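
    The abstract does not detail the network, so the following is only a generic sketch of a spatio-temporal grid predictor: a convolutional GRU-style cell consumes a short history of semantic occupancy grids and rolls out a few future grids. All shapes, channel counts and the recurrence itself are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GridForecaster(nn.Module):
    """Toy spatio-temporal predictor: encode a history of semantic occupancy
    grids with a convolutional GRU cell and decode one future grid per step."""

    def __init__(self, channels=4, hidden=32):
        super().__init__()
        self.hidden = hidden
        self.update = nn.Conv2d(channels + hidden, 2 * hidden, 3, padding=1)
        self.candidate = nn.Conv2d(channels + hidden, hidden, 3, padding=1)
        self.head = nn.Conv2d(hidden, channels, 1)

    def step(self, x, h):
        zr = torch.sigmoid(self.update(torch.cat([x, h], dim=1)))
        z, r = zr.chunk(2, dim=1)
        n = torch.tanh(self.candidate(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * n

    def forward(self, history, horizon=3):
        # history: (B, T, C, H, W) past semantic grids; returns future grids.
        b, _, _, hgt, wid = history.shape
        h = history.new_zeros(b, self.hidden, hgt, wid)
        for t in range(history.shape[1]):
            h = self.step(history[:, t], h)
        preds, x = [], history[:, -1]
        for _ in range(horizon):
            h = self.step(x, h)
            x = torch.sigmoid(self.head(h))
            preds.append(x)
        return torch.stack(preds, dim=1)


if __name__ == "__main__":
    grids = torch.rand(1, 5, 4, 64, 64)      # 5 past frames, 4 semantic channels
    print(GridForecaster()(grids).shape)     # -> torch.Size([1, 3, 4, 64, 64])
```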

    YOLO-based Panoptic Segmentation Network

    Get PDF
    Autonomous vehicles need information about their surroundings to navigate them safely. To this end, the task of panoptic segmentation is proposed as a method of fully parsing the scene by assigning each pixel a label and an instance id. Given the constraints of autonomous driving, this process needs to be fast. In this paper, we propose the first panoptic segmentation network based on the YOLOv3 real-time object detection network, obtained by adding semantic and instance segmentation branches. YOLO-panoptic is able to run inference in real time and achieves performance similar to state-of-the-art methods on some metrics.
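
    A minimal sketch of the general idea of parallel heads on a shared backbone feature map is shown below; the channel sizes and head layouts are assumptions and do not reproduce YOLO-panoptic's actual branches.

```python
import torch
import torch.nn as nn

class ParallelHeads(nn.Module):
    """Toy illustration of parallel task heads reading one shared feature map:
    a YOLO-style detection head, a semantic head and a pixel-embedding head
    for instances, so a single forward pass serves all three tasks."""

    def __init__(self, channels=256, num_classes=19, num_anchors=3, embed_dim=8):
        super().__init__()
        self.detect = nn.Conv2d(channels, num_anchors * (num_classes + 5), 1)
        self.semantic = nn.Conv2d(channels, num_classes, 1)
        self.embedding = nn.Conv2d(channels, embed_dim, 1)

    def forward(self, feats):
        return self.detect(feats), self.semantic(feats), self.embedding(feats)


if __name__ == "__main__":
    feats = torch.randn(1, 256, 32, 32)       # shared backbone feature map
    det, sem, emb = ParallelHeads()(feats)
    print(det.shape, sem.shape, emb.shape)
```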

    Instance Segmentation with Unsupervised Adaptation to Different Domains for Autonomous Vehicles

    Get PDF
    Detection of the objects around a vehicle is important for the safe and successful navigation of an autonomous vehicle. Instance segmentation provides a fine and accurate classification of objects such as cars, trucks, pedestrians, etc. In this study, we propose a fast and accurate approach that detects and segments object instances and can be adapted to new conditions without requiring labels from the new condition. Furthermore, the performance of the instance segmentation does not degrade in detecting objects in the original condition after the model adapts to the new condition. To our knowledge, there are currently no other methods that perform unsupervised domain adaptation for the task of instance segmentation using non-synthetic datasets. We evaluate the adaptation capability of our method on two datasets. First, we test its capacity to adapt to a new domain; second, we test its ability to adapt to new weather conditions. The results show that it can adapt to new conditions with improved accuracy while preserving the accuracy of the original condition.
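
    The abstract does not state which adaptation mechanism is used, so the sketch below only illustrates one common option for unsupervised domain adaptation: adversarial feature alignment with a gradient reversal layer, where a domain classifier is trained on backbone features and the reversed gradient pushes those features to become domain-invariant.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, flips the gradient sign in the backward
    pass, so the feature extractor learns to fool the domain classifier."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None


class DomainDiscriminator(nn.Module):
    def __init__(self, channels=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, feats, lam=1.0):
        return self.net(GradReverse.apply(feats, lam))


if __name__ == "__main__":
    feats = torch.randn(2, 256, 32, 32, requires_grad=True)  # backbone features
    disc = DomainDiscriminator()
    # Domain labels: 0 = source (labelled), 1 = target (unlabelled).
    loss = nn.functional.binary_cross_entropy_with_logits(
        disc(feats), torch.tensor([[0.0], [1.0]]))
    loss.backward()
    print(feats.grad.shape)   # gradients flow back, with reversed sign
```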

    Early alterations of B cells in patients with septic shock

    Get PDF
    Introduction: It has recently been proposed that B lymphocytes are involved in sepsis pathogenesis. The goal of this study is to investigate potential abnormalities in the subset distribution and activation of circulating B lymphocytes in patients with septic shock. Methods: This observational prospective study was conducted in a medical-surgical ICU. All patients with septic shock were eligible for inclusion. B-cell phenotypes (CD19+CD69+, CD19+CD23+, CD19+CD5+, CD19+CD80+, CD19+CD86+, CD19+CD40+ and CD19+CD95+) were assessed by quantitative flow cytometry upon admission to the ICU and 3, 7, 14 and 28 days later. Results: Fifty-two patients were included. Thirty-six healthy volunteers matched for age and sex were used as controls. The patients had lymphopenia that was maintained during 28 days of follow-up. In patients with septic shock who died, the percentage of CD19+CD23+ cells was lower during the 7 days of follow-up than it was in surviving patients. Moreover, the percentage of CD80+ and CD95+ expression on B cells was higher in patients who died than in survivors. Receiver operating characteristic curve analysis showed that a CD19+CD23+ value of 64.6% at ICU admission enabled discrimination between survivors and non-survivors with a sensitivity of 90.9% and a specificity of 80.0% (P = 0.0001). Conclusions: Patients with septic shock who survive and those who do not have different patterns of abnormalities in circulating B lymphocytes. At ICU admission, a low percentage of CD23+ and a high percentage of CD80+ and CD95+ on B cells were associated with increased mortality in patients with septic shock. Moreover, a drop in circulating B cells persisted during 28 days of ICU follow-up. This work was partially funded by grants from Fondo de Investigación de la Seguridad Social, Ministerio de Economía y Competitividad (MEC) (Spain), Consejería de Educación, Comunidad de Madrid, MITIC-CM (S-2010/BMD-2502) and Instituto de Salud Carlos III, MEC (PI051871, CIBERehd).
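
    As a schematic illustration of how such a discriminating cut-off can be derived from a receiver operating characteristic curve (using synthetic numbers, not the study's data), the Youden index picks the threshold that maximizes sensitivity + specificity - 1.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Synthetic example only: CD19+CD23+ percentages for survivors vs non-survivors.
rng = np.random.default_rng(0)
cd23_survivors = rng.normal(75, 10, 30)        # higher CD23+ in survivors
cd23_nonsurvivors = rng.normal(55, 10, 20)

values = np.concatenate([cd23_survivors, cd23_nonsurvivors])
died = np.concatenate([np.zeros(30), np.ones(20)])   # 1 = non-survivor

# Low CD23+ predicts death, so use the negated value as the score
# (higher score -> more likely to die).
fpr, tpr, thresholds = roc_curve(died, -values)
youden = tpr - fpr
best = np.argmax(youden)

print(f"cut-off: {-thresholds[best]:.1f}%")
print(f"sensitivity: {tpr[best]:.2f}, specificity: {1 - fpr[best]:.2f}")
```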

    Lane Detection and Trajectory Generation System

    Get PDF
    This paper presents the development of a perception system that enables an Ackermann-type autonomous vehicle to move through urban environments using control commands based on short-term trajectory planning. We propose a lane detection and keeping system based on computationally efficient computer vision techniques. A Kalman filter-based estimation module was also added to gain robustness against illumination changes and shadows. Additionally, the simulation and control of the Autónomo Uno robot gave good results, with the robot following the steering commands to keep its position. In simulation the controllers had some slight noise problems, but the robot executed the given steering commands and moved along the road. This behavior was also observed in the physical implementation.
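
    As an illustration of a Kalman filter-based estimation module, the sketch below smooths a noisy per-frame lane parameter (here a lateral offset) with a constant-velocity model; the state layout and noise values are assumptions rather than the paper's actual filter.

```python
import numpy as np

class LaneKalman:
    """1-D constant-velocity Kalman filter over a lane parameter
    (e.g. lateral offset of the detected lane centre in pixels), used to
    smooth noisy per-frame detections under illumination changes."""

    def __init__(self, q=1e-2, r=4.0):
        self.x = np.zeros(2)                          # [offset, offset_rate]
        self.P = np.eye(2) * 100.0                    # initial uncertainty
        self.F = np.array([[1.0, 1.0], [0.0, 1.0]])   # transition (dt = 1 frame)
        self.H = np.array([[1.0, 0.0]])               # only the offset is measured
        self.Q = np.eye(2) * q                        # process noise
        self.R = np.array([[r]])                      # measurement noise

    def update(self, measured_offset):
        # Predict.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Correct with the new detection.
        y = measured_offset - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + (K @ y).ravel()
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return self.x[0]


if __name__ == "__main__":
    kf = LaneKalman()
    noisy = 10 + np.random.default_rng(1).normal(0, 2, 20)  # noisy lane offsets
    print([round(kf.update(z), 2) for z in noisy])
```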

    LAPTNet-FPN: Multi-scale LiDAR-aided Projective Transform Network for Real Time Semantic Grid Prediction

    Get PDF
    Semantic grids can be useful representations of the scene around an autonomous system. By having information about the layout of the space around itself, a robot can leverage this type of representation for crucial tasks such as navigation or tracking. By fusing information from multiple sensors, robustness can be increased and the computational load of the task lowered, achieving real-time performance. Our multi-scale LiDAR-Aided Perspective Transform network uses information available in point clouds to guide the projection of image features to a top-view representation. This results in a relative improvement over the state of the art in semantic grid generation for the human (+8.67%) and movable object (+49.07%) classes in the nuScenes dataset, while achieving results close to the state of the art for the vehicle, drivable area and walkway classes and performing inference at 25 FPS.
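
    A rough sketch of how top-view feature maps from several scales could be merged before prediction is given below; the channel sizes, the upsample-and-sum scheme and the segmentation head are illustrative assumptions, not LAPTNet-FPN's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleBEVFusion(nn.Module):
    """FPN-style merge of top-view feature maps produced at several scales:
    coarser maps are upsampled and summed into the finest one before a final
    head predicts the semantic grid."""

    def __init__(self, channels=(256, 256, 256), out_classes=6):
        super().__init__()
        self.laterals = nn.ModuleList(nn.Conv2d(c, 128, 1) for c in channels)
        self.head = nn.Conv2d(128, out_classes, 3, padding=1)

    def forward(self, bev_feats):
        # bev_feats: list of (B, C_i, H_i, W_i) top-view maps, finest first.
        target = bev_feats[0].shape[-2:]
        merged = sum(
            F.interpolate(lat(f), size=target, mode="bilinear", align_corners=False)
            for lat, f in zip(self.laterals, bev_feats))
        return self.head(merged)


if __name__ == "__main__":
    feats = [torch.randn(1, 256, s, s) for s in (200, 100, 50)]
    print(MultiScaleBEVFusion()(feats).shape)   # -> torch.Size([1, 6, 200, 200])
```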

    LAPTNet: LiDAR-Aided Perspective Transform Network

    Get PDF
    Semantic grids are a useful representation of the environment around a robot. They can be used in autonomous vehicles to concisely represent the scene around the car, capturing vital information for downstream tasks like navigation or collision assessment. Information from different sensors can be used to generate these grids. Some methods rely only on RGB images, whereas others choose to incorporate information from other sensors, such as radar or LiDAR. In this paper, we present an architecture that fuses LiDAR and camera information to generate semantic grids. By using the 3D information from a LiDAR point cloud, the LiDAR-Aided Perspective Transform Network (LAPTNet) is able to associate features in the camera plane with the bird's-eye view without having to predict any depth information about the scene. LAPTNet achieves an improvement of up to 8.8 points (or 38.13%) over state-of-the-art camera-only approaches for the classes proposed in the nuScenes dataset validation split.
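
    The core idea of a LiDAR-aided perspective transform can be sketched as follows: LiDAR points are projected onto the image feature map, and the sampled features are scattered into the bird's-eye-view cells given by the same points' ground-plane coordinates. The function name, shapes and grid parameters below are made-up illustrations, not LAPTNet's implementation.

```python
import torch

def lidar_aided_bev(img_feats, points, intrinsics, grid_size=200, cell=0.5):
    """Scatter camera features into a bird's-eye-view grid using LiDAR geometry.

    img_feats  : (C, Hf, Wf) feature map from the image backbone.
    points     : (N, 3) LiDAR points in the camera frame (x right, y down,
                 z forward), assumed here to cover the area of interest.
    intrinsics : (3, 3) camera matrix scaled to the feature-map resolution.
    Returns a (C, grid_size, grid_size) top-view grid (ego at the bottom centre).
    """
    c, hf, wf = img_feats.shape
    bev = torch.zeros(c, grid_size, grid_size)

    # 1. Project the 3-D points onto the image feature map.
    front = points[points[:, 2] > 0]
    uvw = (intrinsics @ front.T).T
    u = (uvw[:, 0] / uvw[:, 2]).long().clamp(0, wf - 1)
    v = (uvw[:, 1] / uvw[:, 2]).long().clamp(0, hf - 1)

    # 2. Compute each point's cell in the top-view grid (x lateral, z forward).
    gx = (front[:, 0] / cell + grid_size / 2).long().clamp(0, grid_size - 1)
    gz = (grid_size - 1 - front[:, 2] / cell).long().clamp(0, grid_size - 1)

    # 3. Copy the sampled image feature into the corresponding BEV cell.
    bev[:, gz, gx] = img_feats[:, v, u]
    return bev


if __name__ == "__main__":
    feats = torch.randn(64, 28, 50)
    pts = torch.rand(1000, 3) * torch.tensor([20.0, 2.0, 50.0])   # x, y, z in metres
    K = torch.tensor([[25.0, 0.0, 25.0], [0.0, 25.0, 14.0], [0.0, 0.0, 1.0]])
    print(lidar_aided_bev(feats, pts, K).shape)   # -> torch.Size([64, 200, 200])
```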

    TransFuseGrid: Transformer-based Lidar-RGB fusion for semantic grid prediction

    Get PDF
    Semantic grids are a succinct and convenient approach to representing the environment for mobile robotics and autonomous driving applications. While the use of LiDAR sensors is now widespread in robotics, most semantic grid prediction approaches in the literature focus only on RGB data. In this paper, we present an approach for semantic grid prediction that uses a transformer architecture to fuse LiDAR sensor data with RGB images from multiple cameras. Our proposed method, TransFuseGrid, first transforms both input streams into top-view embeddings, and then fuses these embeddings at multiple scales with Transformers. Finally, a decoder transforms the fused, top-view feature map into a semantic grid of the vehicle's environment. We evaluate the performance of our approach on the nuScenes dataset for the vehicle, drivable area, lane divider and walkway segmentation tasks. The results show that TransFuseGrid achieves superior performance to competing RGB-only and LiDAR-only methods. Additionally, the Transformer feature fusion leads to a significant improvement over naive RGB-LiDAR concatenation. In particular, for the segmentation of vehicles, our model outperforms state-of-the-art RGB-only and LiDAR-only methods by 24% and 53%, respectively.
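
    A minimal sketch of transformer-based fusion of two top-view embeddings is given below: camera tokens attend to LiDAR tokens with cross-attention and the result is added back through a residual connection. The dimensions and the single-scale design are assumptions; TransFuseGrid fuses at multiple scales.

```python
import torch
import torch.nn as nn

class BEVCrossFusion(nn.Module):
    """Fuse top-view camera and LiDAR embeddings with cross-attention:
    camera tokens query the LiDAR tokens, and the attended result is added
    back to the camera stream (one fusion scale shown here)."""

    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, cam_bev, lidar_bev):
        # cam_bev, lidar_bev: (B, C, H, W) top-view embeddings of the same size.
        b, c, h, w = cam_bev.shape
        cam = cam_bev.flatten(2).transpose(1, 2)        # (B, H*W, C) tokens
        lidar = lidar_bev.flatten(2).transpose(1, 2)
        fused, _ = self.attn(query=cam, key=lidar, value=lidar)
        fused = self.norm(cam + fused)                  # residual connection
        return fused.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    cam = torch.randn(1, 128, 50, 50)
    lidar = torch.randn(1, 128, 50, 50)
    print(BEVCrossFusion()(cam, lidar).shape)   # -> torch.Size([1, 128, 50, 50])
```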